Letter-to-phoneme conversion by inference of rewriting rules
Abstract
Phonetization is a crucial step in oral document processing. This paper proposes a new letter-to-phoneme conversion approach that is automatic, simple, portable and efficient. It relies on a machine learning technique initially developed for transliteration and translation: the system infers rewriting rules from examples of words paired with their phonetic representations. The approach is evaluated in the framework of the Pronalsyl Pascal challenge, which includes several datasets in different languages. The results equal or outperform those of the best-known systems. Moreover, thanks to the simplicity of the technique, its inference time is much lower than that of the best-performing state-of-the-art systems.
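To make the idea of inferring rewriting rules from examples concrete, here is a minimal sketch in Python. It is not the method of the paper, whose details the abstract does not give. It assumes the training words are already segmented into aligned (grapheme chunk, phoneme string) pairs, keeps the most frequent phoneme string for each chunk, and applies the resulting rules greedily from left to right with longest match first. The function names `infer_rules` and `phonetize`, the toy data and the phoneme symbols are all hypothetical.

```python
from collections import Counter, defaultdict


def infer_rules(aligned_examples):
    """Keep, for each grapheme chunk, its most frequent phoneme string.

    Assumption: each example is already a list of aligned
    (grapheme_chunk, phoneme_string) pairs; alignment is not learned here.
    """
    counts = defaultdict(Counter)
    for pairs in aligned_examples:
        for graphemes, phonemes in pairs:
            counts[graphemes][phonemes] += 1
    return {g: c.most_common(1)[0][0] for g, c in counts.items()}


def phonetize(word, rules):
    """Rewrite a word left to right, trying the longest matching chunk first."""
    max_len = max(map(len, rules))
    out, i = [], 0
    while i < len(word):
        for n in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            out.append("?")  # no rule covers this letter
            i += 1
    return " ".join(out)


# Toy, hand-aligned training data (hypothetical, for illustration only).
examples = [
    [("sh", "S"), ("i", "I"), ("p", "p")],
    [("s", "s"), ("i", "I"), ("p", "p")],
    [("sh", "S"), ("o", "Q"), ("p", "p")],
    [("h", "h"), ("o", "Q"), ("p", "p")],
]
rules = infer_rules(examples)
print(phonetize("ship", rules))
print(phonetize("hops", rules))
```

With these toy rules, "ship" is rewritten using the two-letter chunk "sh" rather than "s" followed by "h", which is the effect of the longest-match preference. A real system would also have to learn the alignment and use context to choose between competing rules.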
Similar resources
Using Rules to Improve Letter to Sound Conversion of Names
This paper presents an investigation of the use of context sensitive rewrite rules for improving the performance of data driven letter to sound conversion, concentrating on the specific case of British names. Taking a practical point of view, emphasis is put on reduction of the worst phonetization errors, and on improving the maintainability of the system helping in database cultivation, and al...
Dialect variation in Boro Language and Grapheme-to-Phoneme conversion rules to handle lexical lookup fails in Boro TTS System
It is not possible to include all the words in a natural language for general text-to-speech system. Grapheme-to-phoneme conversion system is essential to pronounce a word which is out of vocabulary. Grapheme-to-phoneme rules play a vital role where lexical lookup fails. Though basic Grapheme-to-phoneme rules system is very simple yet it is very powerful for naturalness of a TTS system. Letter-to...
Welsh letter-to-sound rules: rewrite rules and two-level rules compared
In a text-to-speech synthesis system, input words not found in the system's lexicon are passed to letter-to-sound rules, which derive the word's pronunciation. In Welsh, the letter-to-sound rules must be applied in three passes: firstly, to add epenthetic vowels, secondly, to determine stress and vowel location, and thirdly, to perform grapheme-to-phoneme conversion. To begin with, all these le...
Automatic Discovery of Brazilian Portuguese Letter to Phoneme Conversion Rules through Genetic Programming
Letter to phoneme conversion is a basic step in Speech Synthesis processes. Traditionally, the activity involves the implementation of rules that define the mapping of letters into sounds. This paper presents results of the application of an evolutionary computation technique (Genetic Programming), in Brazilian Portuguese synthesis, aiming to discover automatically programs implementing specifi...
Hybrid Grapheme to Phoneme Conversion for Unlimited
Both dictionary-based and rule-based methods on grapheme-to-phoneme conversion have their own advantages and limitations. For example, a large sized phonetic dictionary and complex morphophonemic rules are required for the dictionary-based method and the LTS (letter to sound) rule-based method itself cannot model the complete morphophonemic constraints. This paper describes a grapheme-to-phoneme...